FIELD OF THE INVENTION
BACKGROUND OF THE INVENTION

♦ larae extent In the last few 

Trtdav main stream for handling text ™ es J* ^jr! Hat5l u ase the development and the 
Sffi ™« based »n « ^^^^n^Slto,, ,nd »Ppl-«o„ «. 

• , ,^h™l«.l for professional mrousit, bus'"'* 5 » 

aJI dteap^ointing. This is doe to taw reasons: ^ 
! Ration in the Internet is spread al, ^^V^XS* ^ ^ « 

information. . nf 

,e va/oIi of the e hmmation ot 

2 T»e ma» n o< b. awa.a o. ~w ln*xn«««> «~— 33 

Search through different servers is not correlated. This results in unnecessary multiple
retrieval of the same data from different servers.



««r can b- communicated by many users. 
, ncpat No 4 774 055 Kollin offers a syscem tha : ca id- provides a reference to 

dataoases rei* n d reference librarian., w tne 



database librarian: method and 

The present invention solves the problems described above by providing a new method and
apparatus comprising:

not on a resources list. 

. Vo frorn different sources, 

t^ssssasssa^ — 

interest. 
SUMMARY OF THE INVENTION

In accordance with the present invention, a search and retrieval method and apparatus are
provided, comprising at least some of the following:



1. Client Station



Client Reference Database . 

: 3523S£ SffSSTSrt - »• ,M " g " ,eas< 

! 0n * L Si of .smes =. hosB .po^ec '•»-»» P»" *» " SUCCeSS an<i 
failure accumulated for these hosts. 

Number of returns to a host for document retrieval.
Host ranking that represents the value of a host to a client, based on
past success statistics.

A Query Module having an interface for submitting a search query including at
least some of:

An address of an Internet URL (or equivalent in a similar system)
. Selectable classified subjects

relations, statistical relations and thesaurus
Database updating procedures



2. Local/Client Server



Local/Client Server database _ hiects list (consolidation of the data bases from 

. Number of returns to a host for document retrieval
. Directory for approved documents 

: SSESfS^^M^. — 

. A „s, of '^^SSrof^.ngin. at r.rr,o« server (Worms*.,, 

. ESS***** » .? • — subiecB " sl < " ,d * < ' 

f^'S*?*™*** «duc»o„ C . boolean expreseion 
Local search module 
3. A group of unrelated remote servers

Each remote server is equipped with a search engine.

" £aCh S .nd * om the C " et>t ^ then sent to the client. 

,T h . Boolean expresses broken 

relevant to these Key y,ords- 

TiSrt 7nd I submits the qusry- these reS ults 

second host ana -we d =t the client Tnen .. 

catena. distributed and u n / elate jL ^5 

• |S applicable, m out 

data bases. ^ , n ref erence » the\^« , n 
WAIS - Wide Area Information Servers.

3nalogUeS - M scribed in details in this document 

^^^^^ 

results are provided, the user m*y 



DETAILED DESCRIPTION OF THE INVENTION

OETAILcU uc „,„,„ of information that St« 

. -internet Architecture Board (IAB) puouca ^ QfQcesS 



invention, 



l^M^"^ ^ , rtM , , 04 is pfOV ided on a client station. 



spend only 
contain 
• and, or, not boolean relations

• free text expressions 

The user uses the GUI (Graphic User Interface), 202, to compose a query such as (wagon
or vehicle) and rail and 'Transtech 96' not road.
The term 'Transtech 96' is a free text element being, for example, a specific name.

The Query Module and its associated GUI may be more complex and may be used in a variety of
ways.

More search features that may be available in the GUI of the QM: 
. Setting the maximum number of results to be retrieved. 
. A choice for searching through any selectable combination of:
titles, abstracts, full text, author
or any other relevant section of the document.
. A minimum data relevance level, set to indicate the relevance required of retrieved data.
. Specification of a URL to be used as a starting point for a URL/hypertext based search.
. A selection of one or more subjects out of a classified subject list.

When the user has finished composing the query, he selects
a "submit" menu item to forward the query to the Query Interpreter, 206.

2. The Client Reference Database 

The concept of a Client Reference Database (CRD) is shown by Kollin in US Pat. No. 4,774,655.
The concept is described here in detail to provide the basis for the description of
embodiments of the present invention, among them:

. Automatic generation and maintenance of a CRD.

. Automatic selection of hosts for search from the CRD.

A, opened ^ov. .« <°~«^*£!SZ^XX^ 

associated with. For example, he may select "Electrical vehicles" in the following
illustration of a classified subject list:



Transportation
    Land
        Power sources
            Liquid fuel vehicles
            Electrical vehicles
                Cars
                    Personal
                    Family
                Tracks
                Trains
            Gas driven vehicles
        Road vehicles
        Off track vehicles
        Rails
        Statistical data
    Air
    Sea



boolean option and use only the classified subject list. 

The CRD, 208 in Figure 2, contains at least one of the following elements:
• Classified subject list

• Host characterization:

List of available services from each host.
Parameters required for working with that host and its services,
information on using the host search engine.

Other useful utilities may be provided to expand the CRD such as: 

• Thesaurus 

• Speller 

<subject> [<host name>/<relevance ranking>, ...] ...

<host name>: identifies a host and a service available from that host (assuming no complete
overlap of the information available through different services).
In this description, host name and service will be indicated in
the form "host125".
<relevance ranking>: indicates the relevance of the specific host to the subject.

The relevance ranking of a host is based on ranking
algorithms, such as counting how many times the subject is

host. 

Transportation fiartW** Hart*. ^f^^ 

hostG54/3 ho$t34/1, host935/1] ^o/*^/"37 

Liquid fuel vehicles 
Electrical vehicles [host83/10, host1/10, host867/8, host155/5]

Cars [host83/10, host1/10, host155/5, host867/3]
Trains 
Gas driven vehicles 

It is a particular embodiment of the present invention to associate hosts with all levels of the
CRD, thereby saving the user the need to make a multiple-level subject selection
through to the end of the subject's branch. It is also a particular embodiment of the present
invention that a host may be listed or unlisted along the different levels, having a different
ranking at each level, as described hereinbelow.

Looking at the subject "Transportation", a list of 8 hosts is associated with it. These are at
least some of the hosts, out of a larger list of accessible hosts, that are ranked 1 or higher
using the ranking process. Using this list, host867 would be the first host to be searched for
information. Then, only if not enough information is found, host1 will be searched, and
so on.

If we look at the subject "Land", which is actually "Transportation, Land", we can see that for
that category host34 and host955 are not relevant anymore. Being ranked under 1, they are
deleted from the list. Also, the rank of host83 has increased from 7 to 10 while that of host867
has decreased to 8.

A similar structure is applied throughout the classified subject list. The selection of "Electrical
vehicles", which is actually a selection of "Transportation, Land, Power sources, Electrical
vehicles", is associated with a relatively narrow and focused group of hosts to search in.

It will be appreciated that selecting "Electrical vehicles" under a different category, such as
"Environmental, Pollution, Transportation, Electrical vehicles", would not necessarily have the
same hosts list associated with it. The ranking and the host names may be different.
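
As an illustration only, the association of ranked hosts with subject levels could be held in a
structure like the following Python sketch. The dictionary layout, the function name hosts_for
and the min_rank parameter are assumptions made for this example, and placing "Cars" beneath
"Electrical vehicles" is likewise assumed. The host names and rankings are those of the example
entries above.

```python
# Hypothetical in-memory CRD: a subject path maps to a list of (host, ranking).
# The layout is an assumption; the entries repeat the example in the text.
ELECTRICAL = ("Transportation", "Land", "Power sources", "Electrical vehicles")

CRD = {
    ELECTRICAL: [
        ("host83", 10), ("host1", 10), ("host867", 8), ("host155", 5),
    ],
    ELECTRICAL + ("Cars",): [
        ("host83", 10), ("host1", 10), ("host155", 5), ("host867", 3),
    ],
}


def hosts_for(subject_path, min_rank=1):
    """Return the hosts for a subject path, best ranked first.

    Hosts ranked under min_rank are dropped, mirroring the rule that hosts
    ranked under 1 are deleted from a subject's list.
    """
    entries = CRD.get(tuple(subject_path), [])
    kept = [(host, rank) for host, rank in entries if rank >= min_rank]
    # sorted() is stable, so equally ranked hosts keep their stored order.
    ordered = sorted(kept, key=lambda entry: entry[1], reverse=True)
    return [host for host, _ in ordered]
```

Calling hosts_for with the full path of a subject yields the order in which hosts would be tried,
and the same host can carry a different ranking under a different path.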

Relevant hosts may be located using hypertext-URL search, robots (see for example
http://info.webcrawler.com/mak/projects/robots/robots.html), manual browsing, published
information in professional literature, etc. Updating of the CRD may be automatic or manual
as described hereinbelow.

It will be appreciated that updating of the CRD may be done at least at one specific host,
supported by a team specializing in studying the Internet and upgrading such a CRD. The
client may keep updated by receiving an updated CRD from such a host on a regular basis, or by
an e-mail notification of an update. The client may also initiate the update following a routine:

Check for an update of the host CRD. If an update has been made and at least 2 weeks have
passed since the previous update, download the new CRD to your station.
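
The client-initiated routine above can be sketched in Python as follows. This is a minimal
sketch under stated assumptions: the function name should_download and its three timestamp
parameters are hypothetical, and how the timestamps are obtained from the host is not specified
in the text.

```python
from datetime import datetime, timedelta

# The two-week interval comes from the routine described in the text.
MIN_INTERVAL = timedelta(weeks=2)


def should_download(host_crd_updated_at, last_local_update, now):
    """Decide whether the client should download the host's CRD.

    host_crd_updated_at: when the CRD on the host was last updated
    last_local_update:   when the client last downloaded the CRD
    now:                 the current time
    """
    update_made = host_crd_updated_at > last_local_update
    enough_time_passed = now - last_local_update >= MIN_INTERVAL
    return update_made and enough_time_passed
```

Passing the current time as an argument, instead of reading the clock inside the function,
keeps the decision easy to check for fixed dates.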

3. Search Query Interpretation

The search query is submitted to the Query Interpreter (QI), 206 of Figure 2. In the present
example it contains:
• The boolean expression:

(wagon or vehicle) and (rail) and (Transtech 96) not (road).
• The selected subject (or subjects):

Transportation, Land, Power sources, Electrical vehicles

with its associated hosts list

[host83/10, host1/10, host867/8, host155/5]

Other preferred embodiments of the invention that either do not use a subject selection or
use only subject selection will be described hereinbelow.

In order to retrieve documents from a host, the QI generates modified sets of search queries,
reorganized in a format suitable to that host. Each one of the modified queries is submitted
to the Net Communication Module (NCM), 210, a software module that is designed to send
the query to the selected host (and receive the response from that host). This is shown in
Figure 2 as a communication path of NCM 210 to net 212, which is connected to many hosts
and servers 214.

The NCM will not be described in detail here. There are equivalent software elements in
Internet browsing software using CGI (Common Gateway Interface). One such example is
Navigator from Netscape Communications Corporation of Mountain View, California; another
example is Mosaic from the University of Illinois. Also, a variety of development tools are
available to generate NCM applications, such as The Internet Application Framework from
Netscape Communications Corporation and developer tools from Spyglass Inc. of
Naperville, Illinois.

It is appreciated that the development of such a module is a common task for those skilled in
the art.

3.1 Search Process

Since, as demonstrated in the present example, the user specified a subject for the
search process, this will dominate the first phase of search.

In this example, using the classified subjects list ranking of hosts, host83 will be the first
choice for search.

In a preferred embodiment of the invention, host83 enables processing only a single
term with its search engine. That is, boolean relations such as AND, OR and NOT
are unrecognizable by the search engine of that host. To utilize such a host without
reducing the benefits of a complex boolean query, the QI submits 5 separate search
queries, one query for each of the terms:

wagon, vehicle, rail, Transtech and 96.
The term Transtech 96 is split into Transtech and 96 since host83 does not support a
submission of anything but a single word.

The term road is not useful at this stage as it may result in documents containing only
the word road, documents in which the user is definitely not interested. The word road is
identified as such by the preceding boolean relation not.
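
The splitting of a boolean query into single-term queries can be sketched as below. This is an
illustrative Python sketch, not the QI itself: the function name single_term_queries and the
simple tokenizer are assumptions, and the sketch only handles the query syntax used in the
example (words, parentheses and the relations and, or, not).

```python
import re

OPERATORS = {"and", "or", "not"}


def single_term_queries(expression):
    """Split a boolean expression into single-word queries.

    Words governed by a preceding 'not' (a single word or a whole
    parenthesized group) are left out, since searching for them alone
    would only return unwanted documents.
    """
    tokens = re.findall(r"\(|\)|[^\s()]+", expression)
    terms = []
    depth = 0
    negated_depth = None  # parenthesis depth at which a negated group began
    pending_not = False   # True right after a 'not' token
    for token in tokens:
        lowered = token.lower()
        if token == "(":
            depth += 1
            if pending_not and negated_depth is None:
                negated_depth = depth
            pending_not = False
        elif token == ")":
            if negated_depth == depth:
                negated_depth = None
            depth -= 1
        elif lowered == "not":
            pending_not = True
        elif lowered in OPERATORS:
            pending_not = False
        else:
            wanted = negated_depth is None and not pending_not
            if wanted and token not in terms:
                terms.append(token)
            pending_not = False
    return terms
```

For the example query, the sketch yields the five terms wagon, vehicle, rail, Transtech and 96,
each of which would then be submitted to the host as a separate query.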

It will be appreciated that this method may also be used when the boolean search
specifications of the host are not available to the QI.

A common denominator for all search engines is the search of a single term. In different
databases, however, the format for submitting the term may be different. This
problem is limited if the search is directed to hosts capable of handling HTML and/or
other widely used formats.

Each query will invoke a search process in host83, returning a list of titles, abstracts or
any other information requested from that host. Full text can be automatically retrieved
by subsequent communications with the host. The text is referred to by the URLs that
come with the data from host83 in the preceding search process. Most Internet search
engines provide a list of titles of relevant documents and locations. The URLs are
associated with the titles in that list. This can enable a search on the level of full text
without user intervention. The number of data elements will be limited by the
relevance level (set by the user during the query composition) or by an internal default
limit of the NCM.

When a predetermined amount of information has been retrieved by sending the 5
queries to host83, the Client Search Engine (CSE) is set to work (216). The CSE searches
in the data using the complete boolean expression (wagon or vehicle) and (rail) and
(Transtech 96) not (road). This is possible since the CSE does not have the limitations
that the search engine of host83 may have.
At the end of this process, only documents that satisfy the original boolean expression
are identified as relevant.
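
The local filtering step can be illustrated with the following Python sketch. It is a sketch under
stated assumptions, not the CSE itself: the tuple-based query tree, the function names
has_phrase, evaluate and filter_documents, and the whole-word, case-insensitive matching are
all choices made for this example. The relation "A not B" is treated here as "A and not B".

```python
import re


def has_phrase(text, phrase):
    """True if the phrase occurs in the text as whole words, ignoring case."""
    words = [re.escape(word) for word in phrase.split()]
    pattern = r"\b" + r"\s+".join(words) + r"\b"
    return re.search(pattern, text, re.IGNORECASE) is not None


def evaluate(node, text):
    """Evaluate a query tree of ('term'|'and'|'or'|'not', ...) tuples."""
    operator, *operands = node
    if operator == "term":
        return has_phrase(text, operands[0])
    if operator == "and":
        return all(evaluate(operand, text) for operand in operands)
    if operator == "or":
        return any(evaluate(operand, text) for operand in operands)
    if operator == "not":
        return not evaluate(operands[0], text)
    raise ValueError(f"unknown operator: {operator}")


# (wagon or vehicle) and (rail) and (Transtech 96) not (road)
QUERY = (
    "and",
    ("or", ("term", "wagon"), ("term", "vehicle")),
    ("term", "rail"),
    ("term", "Transtech 96"),
    ("not", ("term", "road")),
)


def filter_documents(documents, query=QUERY):
    """Keep only the documents that satisfy the complete boolean query."""
    return [doc for doc in documents if evaluate(query, doc)]
```

Because the filter runs at the client, the free text element Transtech 96 can be matched as a
phrase again, even though it was split into two words for the single-term host.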
This is used to avoid

repeatable PW^JjVj maintained m ^ e abstract. fhe> aSSO ciated 

, 0 thls •W'jJjSSn relation, is ava.iable. 
selectors of w° DOW , ^ 



ra/7 and T ratteen * ranking, basea on 



been acquired. fem the oiu 



been acquired. ^ jj**^^*, base. The 

K . _^ ,v./illhec 



can restartthe a ^£7S spla ying the results. re ference to 

the point it stopped for d,sp y ^ stepS ,„ 

« described hereinabove wvli be desc 
The process stescno _ _ items 
Pg U re3-. — action of one or ^ 



Tne process desert - foneor more items 

, . The user ^^iSTnurnber of m^mum rtem 

parar neter (302). ^ ^ Que ry -ntecpr^ 0 

, rA i fi ted parameters are 
- - ,c filtered and ranked £161

mutated data from the set of host-comp 
7 The acajmuwe 2) evaluation 

using the CSE ^ ° sausfying PV»>- ' j .terns 

10 lf r e S olts are not «^J n ^, n , e mfc>od.ment of tne u 

automatic search. 1 ne p ^ ^ ^ 

URLs ^ soC,at ^ that refer to the URLs to Key frany cases the <W aCCO rding to 

sentence ^^^Sart of the document may have ^ fae 
- i id! u*t The process of

err— i---------rr 

Classified subjects list based search:

Electrical vehicles m the ciassir 

be referred to and searched. ^ search 

3^ by URUS and the user may choose not to 

^ example, the ^^^2^^*^ be u-d). 

select maximum number of docum . ^specify any subject of the 

Ossified subject «st and s may start by au tomjh^tfj clasS ifled 

«st. in such «J j;^, is done by ^"J^JSS Altar preparing the 

classified subject l-stof ■J*^ |istof 3 reference database that iated , nformBt0 n 

more then one branch of the classmen session, a 

spline search. , n a ho st that has already been searched ,n «fl«J a(ready be en 

searched, such as a log file- ^ a , so on URLs, 

again , - K^iment of the invention, this Srch rnay refer to the same 

1n . P«»^^^SSi7tinc. a URUIW*** e r a closed loop. 
!l _ lj _ ! Mii) l ii | i fittini ff "1 *" biec * lis -*

The URUhyperte* based search the dependency on a CRD as descnbed 

Hsted.Ugth^ 

• Yahoo Search Engine (http://www.yahoo.com)

• WebCrawler (http://webcrawler.com)

One of such hosts may be used as »• ^tKsf^e retrieved documents are 

Sased search. The modified query ^SjSS^P^Sm takes place, using the filtered 
filtered locally using the <^^^gS£ further search. The process conbnues as 
documents, to 9 e " erat9 ^ r ^u 'ffhvpe^t based search. 

described hereinabove, for the URL/hypeite«o fURLsis 

,„ another embodiment of the invention th; ^^ja^iK^^^ - 
done successively on all the "general purpose databases according t0 the 

5 ^-^hingkithr"^ ^ a reh query 

search will be evoked, ^biects in the classified list is interpreted to 

the method: 

Transportation
    Land
        Power sources
            Liquid fuel vehicles
            Electrical vehicles
                Cars
                    Personal
                    Family
                Tracks
                Trains
            Gas driven vehicles
        Road vehicles
        Off track vehicles
        Rails
        Statistical data
    Air
    Sea

a,*, .1 ssaflesLss^"'' «-— * <"-*" °" he ^ „ ^ ,„ 

adjacency of words. 

• Algorithms may be used for composing a query from a selected subject. One
jSr^fflSE in the above example: 






. ™ «ecified a s adjacent terms enclosed in 
, . Alt bywords that confute a level are specf, 

3. AH the terms in the branch • <£* a ^ subject itself, are m the query. . 

selected subject, and the term & ^ ^ 

, mav be used by adding truncation operators and 9 



operators assume an extreme — — 

m this example we sua" assume or a/7 d and (J. 

,t is also assumed jn a searC h session. 

processing o^Y a • Q f the following steps: ^-..-..p. expressions). 

3 Generate a set of query, each q vy .^ racW no { fear and race) w.H 

search in a separate searcn qu 
complete query expression. 

2J^iS^£^^ e ^nt a variety of characteristics: 

. An interest group may wje 

results of all gf° u P m *^!L t i cal jy generated. 
. An ewCRDmaybe 9 utoma n ca.yo future of updating a CRD according 

An example of updating a CRD is provided by the following steps:

rL-.«a.«---" *'" ank,n8 . 
3. For each such host, calculate a new ranking using the formula:

NHR = OHR x (0.3 x RF + 0.4 x (4 x AF) + 0.3 x (NH x RN))

where

NHR = New Host Ranking,

OHR = Old Host Ranking,

RF = [Average(Rank i)]/OHR, Rank i being the rankings of the documents received. Higher
document rankings contribute to increase the ranking of the host.
AF = (number of documents approved)/(total number of documents), approval being given by the
user. A higher fraction is a contribution to increase the host ranking.
NH = Number of hosts searched,

RN = (number of relevant documents from that host)/(number of relevant
documents from all hosts). The percentage of documents approved thus affects
the NHR.
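
The ranking update can be worked through in a short Python sketch. The function name
new_host_ranking and its parameter names are assumptions made for this example, and the total
number of documents in AF is taken here to mean the documents received from the host, which
the text does not state explicitly. The weights are those of the formula above.

```python
def new_host_ranking(ohr, doc_ranks, approved, total_docs,
                     hosts_searched, relevant_from_host, relevant_from_all):
    """Compute NHR = OHR x (0.3 x RF + 0.4 x (4 x AF) + 0.3 x (NH x RN)).

    ohr:                old host ranking (OHR), must be non-zero
    doc_ranks:          rankings of the documents received from the host
    approved:           number of documents approved by the user
    total_docs:         total number of documents (assumed: from this host)
    hosts_searched:     number of hosts searched (NH)
    relevant_from_host: relevant documents from this host
    relevant_from_all:  relevant documents from all hosts
    """
    if not doc_ranks or total_docs == 0 or relevant_from_all == 0:
        raise ValueError("ranking update needs at least one document")
    rf = (sum(doc_ranks) / len(doc_ranks)) / ohr
    af = approved / total_docs
    rn = relevant_from_host / relevant_from_all
    return ohr * (0.3 * rf + 0.4 * (4 * af) + 0.3 * (hosts_searched * rn))
```

With these weights the ranking stays unchanged when the average document ranking equals the
old host ranking, one document in four is approved, and the host supplies an equal share of the
relevant documents among the hosts searched. Better results on any of the three factors raise
the ranking and worse results lower it.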

OHR-HRo. HRo being the r~™J^^ the ciassified 

hro - " d hostto * e CR0 - 

Based on existing standards such as HTML, details of a host's search engine may be
automatically retrieved from the received data. For example, a received page may be
searched for specific expressions such as

<form action="A"> and <input type=submit value=B>
The URL address A and the value B may be tested 
^^^^^ 

saved automatically in the CRD. 
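
A minimal Python sketch of locating such expressions in a received page is shown below. It
uses the standard library html.parser module. The class name SearchFormFinder, the function
name find_search_forms and the shape of the returned records are assumptions made for this
example, and the testing and saving of the values in the CRD is left out.

```python
from html.parser import HTMLParser


class SearchFormFinder(HTMLParser):
    """Collect the action URL (A) and submit value (B) of each form."""

    def __init__(self):
        super().__init__()
        self.forms = []       # one record per completed <form> element
        self._current = None  # record of the form being parsed, if any

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "form":
            self._current = {
                "action": attributes.get("action"),
                "submit_value": None,
            }
        elif tag == "input" and self._current is not None:
            if (attributes.get("type") or "").lower() == "submit":
                self._current["submit_value"] = attributes.get("value")

    def handle_endtag(self, tag):
        if tag == "form" and self._current is not None:
            self.forms.append(self._current)
            self._current = None


def find_search_forms(html_text):
    """Return the forms found in a page as a list of dictionaries."""
    finder = SearchFormFinder()
    finder.feed(html_text)
    finder.close()
    return finder.forms
```

Each returned record holds a candidate pair of URL address A and value B, which could then be
tested against the host before anything is stored.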

characteristics such as; v 
. The expression search w msulf w </W/e> . 

• The expression match'. 

* The expression camera*. 

representing an input page to such a search engine.
In yet another embodiment, the

having the same interest (.merest group - such as > «™»^?» ™ * ^ 6enems ,„ m 



the shared CRD. 



in another embodiment 

specialists. CR01 may have the advantage « < " ^„ ^Jj. „.„ hosts that have 

information, auch as m. Interne t The , "'l"}™^^™^,^ Boolean search 
been connected to th. net. .ndud.ng their »P»^f n 3"^" a ," e , 5 information source. It 
capabilities, available formats I ™* "»£ ^vSd,™ the net. to avoid 

insert, *>nem .mm m. global S^oS^Sd. 

to that client, and the client's CRD is a subgroup of CRD1.

1. Load CRD1 into the local station.

! : reKts«^^ 

subject, copy the relevant details to the client's CRD.

,„ « second e«mp,e. ^ h ^*^^^^3SS ESS* 

for all branches of CRD1 . Through this procedure, the client nas « p 
to his own history of hosts ranking. 

,„ ar.ctt.er embodiment o, me Inventon. The <*^%S£Z% SSoTc&E 
Labases which are search by a client 

locations that have no search ""^r^Ti ?uRLmyperte*t document that 

° n,SUC \^Zeb'mit^u;af 6 /athena. m itedu/o,gWr^nt/ ;cop»tight.listhtml 

This URL '^ o ^~^"p t ^|.TR6S0URCES TOR THE INVENTOR" 

me documen?^^^, URLs, and , short content descnp.on wh.ch ,. 

done locally at the client's station.
<subject> [<host name>/<relevance ranking>, <URL>/<relevance ranking>, ...]

^ „ lMn .„ UR ,s — *. - - 

. Fiesta" hosts according »»«M>e a » hosts according to th*i' 

relevance ranking. no searC h is 

• . ■ that the CRD updating process may 
It w,H be apprecatao that £ e^KU 

^anuallv controlled or completely m generate a 

■ * „ the CRD updating method can be used tog g a 

« W.« 3 >so be -P^^S^SSlng to the above men * onjdJjP ^ h ;o 



a^Silnaan^na^^ . _ __ owed by each of the 

^etanotherembodimentofth^ 

this kind of service to such an in received documents 

?S,rs;r P ss » — . compuMrs (NC) . 

and can be arranged in , mjny form bed configU rat,on. 

example and it is note limited to in« in the time that this 

.Wo. was — u 
nrp orovided as examples
* ri«cnbed hereinabove are pro^ae 
■«w>h that the embodiments cJesc > descrip tion, 

T*e scope of the invenu 

9 Th« - ~ """""" ' " op—, 'NOT «* „ 

The method accordmg any 
^formation retried from m ^ ^ 0 , 

* jSg^Sfcts ~ - 

K *P «* * sionJsearch engine and; 

is acceptable by the s^o . Qn s()UrCeS . 

a reference database or 

The method accords to daim 9 ^lud.ng: ^ ^ 



10. 

classified subject l.st 



,3 The method according to any of cla.ms . 9 

, 4 The method according to any of claims 9 trough 
W*» reference databases. 

—-«——— ~* 

, r8ft.no. AHW « .nform.t.- ln(orma tlon 



BESS- - * — - - *~ °~' ean 

delation 'W (adjacency), and. atof ■ ANrJ . 

relating all the enclosed terms by the 
,_ 0 . a method for search * in— in a i— source 
hereinabove. 
Perform URL/hypertext based search
Perform classified list based search

Display results